Clustering in Newspaper Pages
Abstract
In the analysis of a newspaper page an important step is the clustering of various text blocks into logical units, i.e., into articles. We propose three algorithms based on text processing techniques to cluster articles in newspaper pages. Based on the complexity of the three algorithms and experimentation on actual pages from the Italian newspaper L’Adige, we select one of the algorithms as the preferred choice to solve the textual clustering problem.
Similar resources
Textual Article Clustering in Newspaper Pages
In the analysis of a newspaper page an important step is the clustering of various text blocks into logical units, i.e., into articles. We propose three algorithms based on text processing techniques to cluster articles in newspaper pages. Based on the complexity of the three algorithms and experiment on actual pages from the Italian newspaper L’Adige, we select one of the algorithms as the pre...
An Architecture for Efficient News Items Clustering and Retrieval Based on Language Models for a Dynamic Collection of E-Newspapers
Newspaper pages comprise multiple individual articles divided into multiple columns. The challenging part of this task is to organize and integrate article blocks in the newspaper. This paper proposes a novel approach for article reconstruction from newspapers, including an aggregation of multiple sections of an article and reading order recovery of each individual article. Thus, the process combi...
Reflection of Knowledge and Information Science’s News in the Press: A Case Study of Iran Newspaper
Background and Aim: The present study aims to explore the coverage and reflection of Knowledge and Information Science news in the Iranian press. Iran Newspaper, which is one of the main public newspapers in the country, has been selected as the case for this study. Method: This study used content analysis as its research methodology and adopted an inductive approach in data analysis. All the pag...
Finding Community Base on Web Graph Clustering
Search Pointers organize the main part of the application on the Internet. However, because of information management hardware, the high volume of data, and word similarities across different fields, most answers to users' questions aren't correct. Web graph clustering and the placement of clusters in the corresponding answers therefore help users achieve their intended results. Community (web communit...
Automatic Layouting of Personalized Newspaper Pages
Layouting items in a 2D-constrained container for maximizing item value and minimizing wasted space is a 2D Cutting and Packing problem. We consider this task in the context of layouting news articles on fixed-size pages in a system for delivering personalized newspapers. We propose a grid-based page structure where articles can be laid out in different variants for increased flexibility. In ad...